Balancing spoken content adaptation and unit length in the recognition of emotion and interest
Abstract
Recognition and detection of non-lexical or paralinguistic cues from speech usually rely on one general model per event (emotional state, level of interest). This model is commonly trained independently of the phonetic structure. Given sufficient data, this approach seemingly works well enough. This paper, however, asks at which phonetic level emotion and level of interest set in. We therefore compare phoneme-, word-, and sentence-level analysis for emotional sentence classification, using a large prosodic, spectral, and voice quality feature space for SVM, and MFCC for HMM/GMM. The experiments also take the need for ASR into account when selecting appropriate unit models. In experiments on the well-known public EMO-DB database, and on the SUSAS and AVIC spontaneous interest corpora, we found that sentence-level analysis yields the best emotion recognition results. We discuss the implications of these types of analysis for the design of robust emotion and interest recognition in usable human-machine interfaces (HMI).
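The unit-length comparison described above can be pictured with a small sketch. It is illustrative only and is not the paper's feature extraction: the function names, the five statistics, the toy pitch values, and the segment boundaries are all assumptions. It shows how the same frame-level contour gives one fixed-length feature vector per sentence, per word, or per phoneme, depending on the boundaries chosen. A classifier such as an SVM would then be trained on the vectors of the chosen unit.

```python
from statistics import mean, pstdev


def functionals(frames):
    """Summarise a variable-length frame sequence as a fixed set of statistics."""
    return {
        "mean": mean(frames),
        "std": pstdev(frames),  # population std, defined even for a single frame
        "min": min(frames),
        "max": max(frames),
        "range": max(frames) - min(frames),
    }


def unit_features(contour, boundaries):
    """Apply the functionals to each (start, end) frame span; end is exclusive."""
    return [functionals(contour[start:end]) for start, end in boundaries]


# Toy pitch contour in Hz, one value per frame (made-up numbers).
pitch = [110.0, 120.0, 130.0, 180.0, 200.0, 190.0, 150.0, 140.0]

# Hypothetical segmentations of the same utterance at three unit lengths.
sentence_units = [(0, 8)]
word_units = [(0, 3), (3, 8)]
phoneme_units = [(0, 2), (2, 3), (3, 5), (5, 6), (6, 8)]

for name, units in [
    ("sentence", sentence_units),
    ("word", word_units),
    ("phoneme", phoneme_units),
]:
    feats = unit_features(pitch, units)
    print(name, len(feats), [round(f["mean"], 1) for f in feats])
```

Shorter units give more vectors per utterance, but each is computed from fewer frames, and word or phoneme boundaries have to come from somewhere, typically an ASR alignment. That is the trade-off the abstract refers to.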
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive is one of the key issues for this broadcasting...
Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language
Setup of an emotion recognition or emotional speech recognition system is directly related to how emotion changes the speech features. In this research, the influence of emotion on the anger and happiness was evaluated and the results were compared with the neutral speech. So the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...
The Relation of Attachment Styles, Emotion Regulation, and Resilience to Well-being among Students of Medical Sciences
Introduction: Psychological well-being, reflecting positive mood, vitality, and interest in milieu, is a part of quality of life psychology. Attachment styles could be theoretically linked to well-being through stress appraisal patterns that include emotion regulation and resilience. Researchers believe that attachment, resilience, and emotion regulation have generally been identified as import...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Publication date: 2008